TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play

نویسنده

  • Gerald Tesauro
چکیده

TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results, based on the TD(X) reinforcement learning algorithm (Sutton 1988). Despite starting from random initial weights (and hence random initial strategy), TD-Gammon achieves a surprisingly strong level of play. With zero knowledge built in at the start of learning (i.e., given only a ”raw” description of the board state), the network learns to play at a strong intermediate level. Furthermore, when a set of handcrafted features is added to the network’s input representation, the result is a truly staggering level of performance: the latest version of TD-Gammon is now estimated to play at a strong master level that is extremely close to the world’s best human players.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Programming backgammon using self-teaching neural nets

TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results. Starting from random initial play, TD-Gammon’s selfteaching methodology results in a surprisingly strong program: without lookahead, its positional judgement rivals that of human experts, and when combined with shallow lookahead, it reaches a level of pla...

متن کامل

Feature Construction for Reinforcement Learning in Hearts

Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-gammon illustrated how the combination of game tree search and learning methods can achieve grand-master level play in backgammon. In this work, we develop a player for the game of hearts, a 4-player game, based on stochastic linear regression and TD learning. Using a small ...

متن کامل

Backgammon , anyone ? Neural learning theory tested

Madison, Wis. A computer took on a human grand master and won 99 of 100 games at the American Association of Artificial Intelligence meeting held here recently. Though the system used was developed at IBM Corp.'s T.J. Watson Research Center, this time the game was not chess, but backgammon. Also, unlike Deep Blue, the machine playing was not running a conventional computer program, but a neural...

متن کامل

DRAFT GP-Gammon: Genetically Programming Backgammon Players

We apply genetic programming to the evolution of strategies for playing the game of backgammon. We explore two different strategies of learning: using a fixed external opponent as teacher, and letting the individuals play against each other. We conclude that the second approach is better and leads to excellent results: Pitted in a 1000-game tournament against a standard benchmark player—Pubeval...

متن کامل

Why did TD-Gammon Work?

Although TD-Gammon is one of the major successes in machine learning, it has not led to similar impressive breakthroughs in temporal difference learning for other applications or even other games. We were able to replicate some of the success of TD-Gammon, developing a competitive evaluation function on a 4000 parameter feed-forward neural network, without using back-propagation, reinforcement ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Neural Computation

دوره 6  شماره 

صفحات  -

تاریخ انتشار 1994